fix: preserve operand width in DecimalValue checked arithmetic #7380
…thmetic (vortex-data#7022) Signed-off-by: Abanoub Doss <abanoub.doss@gmail.com>
I256 is not the maximum of the inputs. I would accept a PR doing this.
This PR has been marked as stale because it has been open for 14 days with no activity. Please comment or remove the stale label if you wish to keep it active, otherwise it will be closed in 7 days.
Hi @joseph-isaacs, sorry, I'm not sure I understood. Are you saying this change is fine? If so, what would the process be to get this approved / merged?
joseph-isaacs left a comment:
Thanks for this, and sorry for the delay. Just a few small comments and we are good to go.
…rt widening cast
Address review feedback on vortex-data#7380:
- Rename `checked_binary_op!` to `checked_widening_binary_op!` to make the upcast to the wider operand type explicit at call sites.
- Replace `cast()?` with `vortex_expect`, since widening a `DecimalValue` to a type at least as wide as itself cannot fail.
Signed-off-by: Abanoub Doss <abnobdoss@proton.me> Signed-off-by: Claude <noreply@anthropic.com>
Signed-off-by: Abanoub Doss <abnobdoss@proton.me> Signed-off-by: Claude <noreply@anthropic.com>
Improve error handling in decimal checked operations
Thank you @joseph-isaacs! I've updated the PR per the comments.
Signed-off-by: abnobdoss <abanoubdoss@gmail.com>
Hi @joseph-isaacs, could you please take a look when you get a chance?
This PR has been marked as stale because it has been open for 14 days with no activity. Please comment or remove the stale label if you wish to keep it active, otherwise it will be closed in 7 days.
Hi @joseph-isaacs, could you please take a look when you get a chance?
Merging this PR will improve performance by 24.74%
| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| ⚡ | Simulation | take_10k_random | 255.8 µs | 197.7 µs | +29.39% |
| ⚡ | Simulation | take_10k_contiguous | 276.3 µs | 218.1 µs | +26.65% |
| ⚡ | Simulation | patched_take_10k_contiguous_patches | 291 µs | 232.1 µs | +25.38% |
| ⚡ | Simulation | patched_take_10k_random | 303 µs | 244 µs | +24.18% |
| ⚡ | WallTime | cuda/bitpacked_u8/unpack/3bw[100M] | 353.5 µs | 298.6 µs | +18.38% |
Comparing abnobdoss:fix/decimal-checked-add-no-upcast (726bfa2) with develop (69ce1ed)
Footnote: 1 benchmark was skipped, so the baseline result was used instead. If it was deleted from the codebase, it can be archived to remove it from the performance reports.
Summary
Closes: #7022
`DecimalValue::checked_add/sub/mul/div` unconditionally upcast both operands to `i256` and returned `DecimalValue::I256`, producing unnecessarily wide scalars even when both inputs were narrow (e.g. `I32 + I32 → I256`).

Operate at `max(self, other)` width instead, matching the pattern in `aggregate_fn/fns/sum/decimal.rs`. The old `checked_binary_op` helper was replaced with a local macro so each op dispatches with its own trait.

AI disclosure: Analysis, implementation, and tests were done with Claude Code under my direction and review.
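
To illustrate the "operate at `max(self, other)` width" idea from the summary, here is a minimal sketch. It is not the PR's code: `ToyDecimal` is a hypothetical stand-in with only three variants (the real `DecimalValue` has wider ones such as `I256`), only addition is shown, and the PR's macro is replaced with a plain `match`.

```rust
/// Hypothetical toy stand-in for `DecimalValue` (assumption: three variants only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ToyDecimal {
    I8(i8),
    I32(i32),
    I64(i64),
}

impl ToyDecimal {
    /// Adds two values at the width of the wider operand.
    /// Returns `None` if the sum overflows that width.
    pub fn checked_add(self, other: Self) -> Option<Self> {
        use ToyDecimal::*;
        match (self, other) {
            // Same width: stay at that width.
            (I8(a), I8(b)) => a.checked_add(b).map(I8),
            (I32(a), I32(b)) => a.checked_add(b).map(I32),
            (I64(a), I64(b)) => a.checked_add(b).map(I64),
            // Mixed widths: losslessly widen the narrower operand first.
            // `From` conversions between these integer types cannot fail.
            (I8(a), I32(b)) | (I32(b), I8(a)) => i32::from(a).checked_add(b).map(I32),
            (I8(a), I64(b)) | (I64(b), I8(a)) => i64::from(a).checked_add(b).map(I64),
            (I32(a), I64(b)) | (I64(b), I32(a)) => i64::from(a).checked_add(b).map(I64),
        }
    }
}
```

In this sketch `I32(1) + I32(2)` stays `I32(3)`, while `I8(i8::MAX) + I8(1)` is `None` because the result does not fit the target width. The or-patterns work here only because addition is commutative; subtraction and division would need each operand order handled separately.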
API Changes
No public API signature changes (verified via `./scripts/public-api.sh`).

Overflow is now caught at the target width rather than silently widening (e.g. `I8(i8::MAX) + I8(1)` now returns `None` instead of `I256(128)`). This felt like the most faithful reading of the issue, but I'd appreciate a sanity check that returning `None` on target-width overflow is the desired semantics rather than, say, promoting to the next wider variant. No in-tree caller depends on the old behavior: `sum/mod.rs` pre-widens the accumulator by +10 precision, `decimal/scalar.rs::checked_binary_numeric` requires both operands to share the same width, and `sum/constant.rs` uses `I128` as the multiplier.

Testing
- … `test_decimal_value_checked_*` to assert the correct variant.
- … `i8::MIN / -1`).
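
The two overflow cases named in this description can be reproduced with plain Rust integers. The sketch below uses only the standard library's `checked_add` and `checked_div`, not `DecimalValue`, and the function names are hypothetical. It contrasts evaluation at `i8` width with evaluation after widening to `i16`.

```rust
/// Evaluates `i8::MAX + 1` and `i8::MIN / -1` at `i8` width.
/// Both results are out of range for `i8`, so both are `None`.
pub fn i8_overflow_cases() -> (Option<i8>, Option<i8>) {
    (i8::MAX.checked_add(1), i8::MIN.checked_div(-1))
}

/// Evaluates the same expressions after widening the operands to `i16`.
/// The mathematical result 128 fits in `i16`, so both succeed.
pub fn widened_cases() -> (Option<i16>, Option<i16>) {
    (
        i16::from(i8::MAX).checked_add(1),
        i16::from(i8::MIN).checked_div(-1),
    )
}
```

The first function returns `(None, None)` and the second `(Some(128), Some(128))`, which is the difference between catching overflow at the target width and promoting to a wider type.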